智能论文笔记

Interpretable estimation of the risk of heart failure hospitalization from a 30-second electrocardiogram

Sergio González , Wan-Ting Hsieh , Davide Burba , Trista Pei-Chun Chen , Chun-Li Wang , Victor Chien-Chia Wu , Shang-Hung Chang

分类：机器学习

2022-11-02

Survival modeling in healthcare relies on explainable statistical models; yet, their underlying assumptions are often simplistic and, thus, unrealistic. Machine learning models can estimate more complex relationships and lead to more accurate predictions, but are non-interpretable. This study shows it is possible to estimate hospitalization for congestive heart failure by a 30 seconds single-lead electrocardiogram signal. Using a machine learning approach not only results in greater predictive power but also provides clinically meaningful interpretations. We train an eXtreme Gradient Boosting accelerated failure time model and exploit SHapley Additive exPlanations values to explain the effect of each feature on predictions. Our model achieved a concordance index of 0.828 and an area under the curve of 0.853 at one year and 0.858 at two years on a held-out test set of 6,573 patients. These results show that a rapid test based on an electrocardiogram could be crucial in targeting and treating high-risk individuals.

translated by 谷歌翻译

Transition Motion Tensor: A Data-Driven Approach for Versatile and Controllable Agents in Physically Simulated Environments

Jonathan Hans Soeseno , Ying-Sheng Luo , Trista Pei-Chun Chen , Wei-Chao Chen

分类：机器人 | 机器学习

2021-11-30

本文提出了过渡动作张量，一种数据驱动的框架，它在运动数据集之外创建新颖和物理准确的转换。它使模拟字符能够有效且强大地采用新的运动技能而无需修改现有问题。考虑到几种专门从事不同运动的物理模拟的控制器，张量用作它们之间的过渡的时间指南。通过查询最佳拟合用户定义的偏好的转换的Tensor，我们可以创建一个能够产生新颖的转换和解决可能需要多个动作的复杂任务的统一控制器。我们在Quadrupeds和Biped上应用框架，对转换质量进行定量和定性评估，并在遵循用户控制指令时展示其解决复杂运动计划问题的能力。

translated by 谷歌翻译

Facial Image Reconstruction from Functional Magnetic Resonance Imaging via GAN Inversion with Improved Attribute Consistency

Pei-Chun Chang , Yan-Yu Tien , Chia-Lin Chen , Li-Fen Chen , Yong-Sheng Chen , Hui-Ling Chan

分类：计算机视觉 | 机器学习

2022-07-03

神经科学研究表明，大脑编码视觉内容并将信息嵌入神经活动中。最近，深度学习技术通过将大脑活动映射到使用生成的对抗网络（GAN）来刺激来解决视觉重建的尝试。但是，这些研究都没有考虑图像空间中潜在代码的语义含义。省略语义信息可能会限制性能。在这项研究中，我们提出了一个新框架，以从功能磁共振成像（fMRI）数据中重建面部图像。使用此框架，首先将GAN倒置用于训练图像编码器以在图像空间中提取潜在代码，然后使用线性转换将其桥接到fMRI数据中。遵循从fMRI数据使用属性分类器确定的属性，确定操纵属性的方向，属性操纵器调整了潜在代码，以提高可见图像和重建图像之间的一致性。我们的实验结果表明，提出的框架实现了两个目标：（1）从fMRI数据中重建清晰的面部图像，以及（2）保持语义特征的一致性。

translated by 谷歌翻译

Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models

Yang Trista Cao , Anna Sotnikova , Hal Daumé III , Rachel Rudinger , Linda Zou

分类：自然语言处理

2022-06-23

已显示在文本上训练的NLP模型可以重现人类的刻板印象，当系统大规模部署系统时，可以放大边缘化组的危害。我们适应了Koch等人的代理 - 信号 - 局势（ABC）刻板印象模型。（2016年）从社会心理学作为系统研究和发现语言模型（LMS）中刻板印象群体特征关联的框架。我们介绍了用于测量语言模型的刻板印象关联的灵敏度测试（集合）。为了使用ABC模型评估集合和其他措施，我们从美国受试者那里收集小组特征判断，以与英语LM刻板印象进行比较。最后，我们扩展了此框架以测量相互切换身份的LM定型观念。

translated by 谷歌翻译

Generative appearance replay for continual unsupervised domain adaptation

Boqi Chen , Kevin Thandiackal , Pushpak Pati , Orcun Goksel

分类：计算机视觉 | 人工智能

2023-01-03

Deep learning models can achieve high accuracy when trained on large amounts of labeled data. However, real-world scenarios often involve several challenges: Training data may become available in installments, may originate from multiple different domains, and may not contain labels for training. Certain settings, for instance medical applications, often involve further restrictions that prohibit retention of previously seen data due to privacy regulations. In this work, to address such challenges, we study unsupervised segmentation in continual learning scenarios that involve domain shift. To that end, we introduce GarDA (Generative Appearance Replay for continual Domain Adaptation), a generative-replay based approach that can adapt a segmentation model sequentially to new domains with unlabeled data. In contrast to single-step unsupervised domain adaptation (UDA), continual adaptation to a sequence of domains enables leveraging and consolidation of information from multiple domains. Unlike previous approaches in incremental UDA, our method does not require access to previously seen data, making it applicable in many practical scenarios. We evaluate GarDA on two datasets with different organs and modalities, where it substantially outperforms existing techniques.

translated by 谷歌翻译

MGTAB: A Multi-Relational Graph-Based Twitter Account Detection Benchmark

Shuhao Shi , Kai Qiao , Jian Chen , Shuai Yang , Jie Yang , Baojie Song , Linyuan Wang , Bin Yan

分类：计算机视觉

2023-01-03

The development of social media user stance detection and bot detection methods rely heavily on large-scale and high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, suppressing graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built based on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain and user tweet features as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when introducing multiple relations. By analyzing experiment results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.

translated by 谷歌翻译

Explaining Imitation Learning through Frames

Boyuan Zheng , Jianlong Zhou , Chunjie Liu , Yiqiao Li , Fang Chen

分类：机器学习 | 计算机视觉

2023-01-03

As one of the prevalent methods to achieve automation systems, Imitation Learning (IL) presents a promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, the corresponding research on the explainability of IL models is still limited. Inspired by the recent approaches in explainable artificial intelligence methods, we proposed a model-agnostic explaining framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in demonstrations. It iteratively retrains the black-box IL model from the randomized masked demonstrations and uses the conventional evaluation outcome environment returns as the coefficient to build an importance map. We also conducted experiments to investigate three major questions concerning frames' importance equality, the effectiveness of the importance map, and connections between importance maps from different IL models. The result shows that R2RISE successfully distinguishes important frames from the demonstrations.

translated by 谷歌翻译

Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment

Liqun Lin , Yang Zheng , Weiling Chen , Chengdong Lan , Tiesong Zhao

分类：计算机视觉

2023-01-03

Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques.

translated by 谷歌翻译

Risk-Averse MDPs under Reward Ambiguity

Haolin Ruan , Zhi Chen , Chin Pang Ho

分类：机器学习

2023-01-03

We propose a distributionally robust return-risk model for Markov decision processes (MDPs) under risk and reward ambiguity. The proposed model optimizes the weighted average of mean and percentile performances, and it covers the distributionally robust MDPs and the distributionally robust chance-constrained MDPs (both under reward ambiguity) as special cases. By considering that the unknown reward distribution lies in a Wasserstein ambiguity set, we derive the tractable reformulation for our model. In particular, we show that that the return-risk model can also account for risk from uncertain transition kernel when one only seeks deterministic policies, and that a distributionally robust MDP under the percentile criterion can be reformulated as its nominal counterpart at an adjusted risk level. A scalable first-order algorithm is designed to solve large-scale problems, and we demonstrate the advantages of our proposed model and algorithm through numerical experiments.

translated by 谷歌翻译

Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling

Penghao Wu , Li Chen , Hongyang Li , Xiaosong Jia , Junchi Yan , Yu Qiao

分类：计算机视觉

2023-01-03

Witnessing the impressive achievements of pre-training techniques on large-scale data in the field of computer vision and natural language processing, we wonder whether this idea could be adapted in a grab-and-go spirit, and mitigate the sample inefficiency problem for visuomotor driving. Given the highly dynamic and variant nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains massive irrelevant information for decision making, resulting in predominant pre-training approaches from general vision less suitable for the autonomous driving task. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for the policy pretraining in visuomotor driving. We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. The proposed PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input. In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving policy related representations and thereby competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios have demonstrated the superiority of our proposed approach, where improvements range from 2% to even over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.

translated by 谷歌翻译